Yu Kong

H-Flow: Self-supervised Human Scene Flow via Physics-inspired Joint Multi-modal Learning

May 21, 2026

Spatially Prompted Visual Trajectory Prediction for Egocentric Manipulation

May 19, 2026

From 3D Pose to Prose: Biomechanics-Grounded Vision–Language Coaching

Mar 27, 2026

ViT-AdaLA: Adapting Vision Transformers with Linear Attention

Mar 17, 2026

IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios

May 27, 2025

H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Apr 14, 2025

Are We Merely Justifying Results ex Post Facto? Quantifying Explanatory Inversion in Post-Hoc Model Explanations

Apr 11, 2025

Window Token Concatenation for Efficient Visual Large Language Models

Apr 05, 2025

Visual Large Language Models for Generalized and Specialized Applications

Jan 06, 2025

LiDAR-based End-to-end Temporal Perception for Vehicle-Infrastructure Cooperation

Nov 22, 2024